In this paper, we develop and parallelize a CFD solver that supports overlapped meshes on multiple MIC devices using multithreading. We optimize the solver in several respects, including vectorization, memory arrangement, and an asynchronous strategy for data exchange among multiple devices. Different vectorization strategies are compared, and the performance of the solver's core functions is reported. Experiments show that a speedup of about 3.16x can be achieved for the six core functions on a single Intel Xeon Phi 5110P MIC card, and a speedup of 5.9x can be achieved using two cards, compared to an Intel E5-2680 processor, for a test case with two ONERA M6 wings.